Goto

Collaborating Authors

 Plateau State


Entropic Causal Inference: Graph Identifiability

Compton, Spencer, Greenewald, Kristjan, Katz, Dmitriy, Kocaoglu, Murat

arXiv.org Artificial Intelligence

Entropic causal inference is a recent framework for learning the causal graph between two variables from observational data by finding the information-theoretically simplest structural explanation of the data, i.e., the model with smallest entropy. In our work, we first extend the causal graph identifiability result in the two-variable setting under relaxed assumptions. We then show the first identifiability result using the entropic approach for learning causal graphs with more than two nodes. Our approach utilizes the property that ancestrality between a source node and its descendants can be determined using the bivariate entropic tests. We provide a sound sequential peeling algorithm for general graphs that relies on this property. We also propose a heuristic algorithm for small graphs that shows strong empirical performance. We rigorously evaluate the performance of our algorithms on synthetic data generated from a variety of models, observing improvement over prior work. Finally we test our algorithms on real-world datasets.


A Study of Data-driven Methods for Inventory Optimization

Ping, Lee Yeung, Wong, Patrick, Han, Tan Cheng

arXiv.org Artificial Intelligence

This paper shows a comprehensive analysis of three algorithms (Time Series, Random Forest (RF) and Deep Reinforcement Learning) into three inventory models (the Lost Sales, Dual-Sourcing and Multi-Echelon Inventory Model). These methodologies are applied in the supermarket context. The main purpose is to analyse efficient methods for the data-driven. Their possibility, potential and current challenges are taken into consideration in this report. By comparing the results in each model, the effectiveness of each algorithm is evaluated based on several key performance indicators, including forecast accuracy, adaptability to market changes, and overall impact on inventory costs and customer satisfaction levels. The data visualization tools and statistical metrics are the indicators for the comparisons and show some obvious trends and patterns that can guide decision-making in inventory management. These tools enable managers to not only track the performance of different algorithms in real-time but also to drill down into specific data points to understand the underlying causes of inventory fluctuations. This level of detail is crucial for pinpointing inefficiencies and areas for improvement within the supply chain.


Nocturnal eye inspired liquid to gas phase change soft actuator with Laser-Induced-Graphene: enhanced environmental light harvesting and photothermal conversion

Sogabe, Maina, Kim, Youhyun, Kawashima, Kenji

arXiv.org Artificial Intelligence

Robotic systems' mobility is constrained by power sources and wiring. While pneumatic actuators remain tethered to air supplies, we developed a new actuator utilizing light energy. Inspired by nocturnal animals' eyes, we designed a bilayer soft actuator incorporating Laser-Induced Graphene (LIG) on the inner surface of a silicone layer. This design maintains silicone's transparency and flexibility while achieving 54% faster response time compared to conventional actuators through enhanced photothermal conversion.


Which Nigerian-Pidgin does Generative AI speak?: Issues about Representativeness and Bias for Multilingual and Low Resource Languages

Adelani, David Ifeoluwa, Doğruöz, A. Seza, Shode, Iyanuoluwa, Aremu, Anuoluwapo

arXiv.org Artificial Intelligence

Naija is the Nigerian-Pidgin spoken by approx. 120M speakers in Nigeria and it is a mixed language (e.g., English, Portuguese and Indigenous languages). Although it has mainly been a spoken language until recently, there are currently two written genres (BBC and Wikipedia) in Naija. Through statistical analyses and Machine Translation experiments, we prove that these two genres do not represent each other (i.e., there are linguistic differences in word order and vocabulary) and Generative AI operates only based on Naija written in the BBC genre. In other words, Naija written in Wikipedia genre is not represented in Generative AI.


Corn Yield Prediction Model with Deep Neural Networks for Smallholder Farmer Decision Support System

Olisah, Chollette, Smith, Lyndon, Smith, Melvyn, Morolake, Lawrence, Ojukwu, Osi

arXiv.org Artificial Intelligence

Given the nonlinearity of the interaction between weather and soil variables, a novel deep neural network regressor (DNNR) was carefully designed with considerations to the depth, number of neurons of the hidden layers, and the hyperparameters with their optimizations. Additionally, a new metric, the average of absolute root squared error (ARSE) was proposed to address the shortcomings of root mean square error (RMSE) and mean absolute error (MAE) while combining their strengths. Using the ARSE metric, the random forest regressor (RFR) and the extreme gradient boosting regressor (XGBR), were compared with DNNR. The RFR and XGBR achieved yield errors of 0.0000294 t/ha, and 0.000792 t/ha, respectively, compared to the DNNR(s) which achieved 0.0146 t/ha and 0.0209 t/ha, respectively. All errors were impressively small. However, with changes to the explanatory variables to ensure generalizability to unforeseen data, DNNR(s) performed best. The unforeseen data, different from unseen data, is coined to represent sudden and unexplainable change to weather and soil variables due to climate change. Further analysis reveals that a strong interaction does exist between weather and soil variables. Using precipitation and silt, which are strong-negatively and strong-positively correlated with yield, respectively, yield was observed to increase when precipitation was reduced and silt increased, and vice-versa.


Unifying (Quantum) Statistical and Parametrized (Quantum) Algorithms

Nietner, Alexander

arXiv.org Machine Learning

Kearns' statistical query (SQ) oracle (STOC'93) lends a unifying perspective for most classical machine learning algorithms. This ceases to be true in quantum learning, where many settings do not admit, neither an SQ analog nor a quantum statistical query (QSQ) analog. In this work, we take inspiration from Kearns' SQ oracle and Valiant's weak evaluation oracle (TOCT'14) and establish a unified perspective bridging the statistical and parametrized learning paradigms in a novel way. We explore the problem of learning from an evaluation oracle, which provides an estimate of function values, and introduce an extensive yet intuitive framework that yields unconditional lower bounds for learning from evaluation queries and characterizes the query complexity for learning linear function classes. The framework is directly applicable to the QSQ setting and virtually all algorithms based on loss function optimization. Our first application is to extend prior results on the learnability of output distributions of quantum circuits and Clifford unitaries from the SQ to the (multi-copy) QSQ setting, implying exponential separations between learning stabilizer states from (multi-copy) QSQs versus from quantum samples. Our second application is to analyze some popular quantum machine learning (QML) settings. We gain an intuitive picture of the hardness of many QML tasks which goes beyond existing methods such as barren plateaus and the statistical dimension, and contains crucial setting-dependent implications. Our framework not only unifies the perspective of cost concentration with that of the statistical dimension in a unified language but exposes their connectedness and similarity.


Cross-Shape Attention for Part Segmentation of 3D Point Clouds

Loizou, Marios, Garg, Siddhant, Petrov, Dmitry, Averkiou, Melinos, Kalogerakis, Evangelos

arXiv.org Artificial Intelligence

We present a deep learning method that propagates point-wise feature representations across shapes within a collection for the purpose of 3D shape segmentation. We propose a cross-shape attention mechanism to enable interactions between a shape's point-wise features and those of other shapes. The mechanism assesses both the degree of interaction between points and also mediates feature propagation across shapes, improving the accuracy and consistency of the resulting point-wise feature representations for shape segmentation. Our method also proposes a shape retrieval measure to select suitable shapes for cross-shape attention operations for each test shape. Our experiments demonstrate that our approach yields state-of-the-art results in the popular PartNet dataset.


Supervised Learning in the Presence of Concept Drift: A modelling framework

Straat, Michiel, Abadi, Fthi, Kan, Zhuoyun, Göpfert, Christina, Hammer, Barbara, Biehl, Michael

arXiv.org Machine Learning

We present a modelling framework for the investigation of supervised learning in non-stationary environments. Specifically, we model two example types of learning systems: prototype-based Learning Vector Quantization (LVQ) for classification and shallow, layered neural networks for regression tasks. We investigate so-called student teacher scenarios in which the systems are trained from a stream of high-dimensional, labeled data. Properties of the target task are considered to be non-stationary due to drift processes while the training is performed. Different types of concept drift are studied, which affect the density of example inputs only, the target rule itself, or both. By applying methods from statistical physics, we develop a modelling framework for the mathematical analysis of the training dynamics in non-stationary environments. Our results show that standard LVQ algorithms are already suitable for the training in non-stationary environments to a certain extent. However, the application of weight decay as an explicit mechanism of forgetting does not improve the performance under the considered drift processes. Furthermore, we investigate gradient-based training of layered neural networks with sigmoidal activation functions and compare with the use of rectified linear units (ReLU). Our findings show that the sensitivity to concept drift and the effectiveness of weight decay differs significantly between the two types of activation function.


Short Term Prediction of Parking Area states Using Real Time Data and Machine Learning Techniques

Provoost, Jesper, Wismans, Luc, Van der Drift, Sander, Kamilaris, Andreas, Van Keulen, Maurice

arXiv.org Machine Learning

Public road authorities and private mobility service providers need information derived from the current and predicted traffic states to act upon the daily urban system and its spatial and temporal dynamics. In this research, a real-time parking area state (occupancy, in- and outflux) prediction model (up to 60 minutes ahead) has been developed using publicly available historic and real time data sources. Based on a case study in a real-life scenario in the city of Arnhem, a Neural Network-based approach outperforms a Random Forest-based one on all assessed performance measures, although the differences are small. Both are outperforming a naive seasonal random walk model. Although the performance degrades with increasing prediction horizon, the model shows a performance gain of over 150% at a prediction horizon of 60 minutes compared with the naive model. Furthermore, it is shown that predicting the in- and outflux is a far more difficult task (i.e. performance gains of 30%) which needs more training data, not based exclusively on occupancy rate. However, the performance of predicting in- and outflux is less sensitive to the prediction horizon. In addition, it is shown that real-time information of current occupancy rate is the independent variable with the highest contribution to the performance, although time, traffic flow and weather variables also deliver a significant contribution. During real-time deployment, the model performs three times better than the naive model on average. As a result, it can provide valuable information for proactive traffic management as well as mobility service providers.


Hidden Unit Specialization in Layered Neural Networks: ReLU vs. Sigmoidal Activation

Oostwal, Elisa, Straat, Michiel, Biehl, Michael

arXiv.org Machine Learning

We study layered neural networks of rectified linear units (ReLU) in a modelling framework for stochastic training processes. The comparison with sigmoidal activation functions is in the center of interest. We compute typical learning curves for shallow networks with K hidden units in matching student teacher scenarios. The systems exhibit sudden changes of the generalization performance via the process of hidden unit specialization at critical sizes of the training set. Surprisingly, our results show that the training behavior of ReLU networks is qualitatively different from that of networks with sigmoidal activations. In networks with K >= 3 sigmoidal hidden units, the transition is discontinuous: Specialized network configurations co-exist and compete with states of poor performance even for very large training sets. On the contrary, the use of ReLU activations results in continuous transitions for all K: For large enough training sets, two competing, differently specialized states display similar generalization abilities, which coincide exactly for large networks in the limit K to infinity.